Detailed protocol
The relative abundance of each sequence type (ST) was estimated from metagenomic data using the 'Bayesian Identification of Bacteria (BIB)'
workflow (1), following the workflow recommended in the authors' GitHub repository (https://github.com/PROBIC/BIB).


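First, the genome assembly of each ST and the metagenomic data are required. As an initial step, the core alignment of all the STs needs to be
extracted. This process requires progressiveMauve (2). In the stripSubsetLCBs command, 500 is the minimum length of a locally collinear block (LCB)
and the final argument is the minimum number of genomes that must share an LCB; adjust the latter to the number of assemblies being aligned.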

$ progressiveMauve --output=full_alignment.xmfa ST_X_assembly.fasta ST_Y_assembly.fasta ST_Z_assembly.fasta 

$ stripSubsetLCBs full_alignment.xmfa full_alignment.xmfa.bbcols core_alignment.xmfa 500 4 


Next, convert the core alignment from XMFA to FASTA format and remove the gaps introduced during the alignment.
$ perl xmfa2fasta.pl --file core_alignment.xmfa > core_alignment.fasta 
$ sed 's/-//g' core_alignment.fasta > core_alignment_gapless.fasta 
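
The sed command above removes every hyphen in the file, including any that appear in FASTA header lines. If the sequence names contain hyphens, a header-safe variant may be preferable. The following is a minimal sketch under that assumption; strip_gaps is a hypothetical helper name and is not part of BIB or Mauve.

```shell
# Remove alignment gaps from sequence lines only, leaving FASTA headers untouched.
# strip_gaps is a hypothetical helper name, not part of BIB or Mauve.
strip_gaps() {
    sed '/^>/!s/-//g'
}

# Run only when the output of the previous step is present.
if [ -f core_alignment.fasta ]; then
    strip_gaps < core_alignment.fasta > core_alignment_gapless.fasta
    # Count sequence lines that still contain a gap character (expected: 0).
    grep -v '^>' core_alignment_gapless.fasta | grep -c -- '-' || true
fi
```

When the headers contain no hyphens, this produces the same output as the original command.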


Once the fasta-formatted core alignment is ready, the metagenome data need to be aligned to the core alignment (core_alignment_gapless.fasta), using 
Bowtie2 (3). 

$ bowtie2-build core_alignment_gapless.fasta core_alignment_gapless 

$ bowtie2 -x core_alignment_gapless -U metagenome_reads.fastq -S metagenome_aligned.sam -a 
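
Before moving on, it can be useful to check how many alignment records were produced. The sketch below counts mapped and unmapped records directly from the SAM file using the FLAG field (bit 4 marks an unmapped read). sam_summary is a hypothetical helper name. Because bowtie2 was run with -a, a single read may contribute several mapped records, so the counts are records, not reads.

```shell
# Count mapped and unmapped alignment records in a SAM file read from stdin.
# sam_summary is a hypothetical helper; bit 4 of the FLAG field marks an unmapped read.
sam_summary() {
    awk -F '\t' '
        /^@/ { next }    # skip SAM header lines
        { if (int($2 / 4) % 2 == 1) unmapped++; else mapped++ }
        END { printf "mapped_records=%d unmapped_records=%d\n", mapped, unmapped }
    '
}

# Run only when the alignment from the previous step is present.
if [ -f metagenome_aligned.sam ]; then
    sam_summary < metagenome_aligned.sam
fi
```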


Then, estimate the abundances of the different STs from the read alignment (metagenome_aligned.sam) and the core alignment
(core_alignment_gapless.fasta) using BitSeq (4).

$ parseAlignment metagenome_aligned.sam -o alignment_info.prob --trSeqFile core_alignment_gapless.fasta --trInfoFile genome_info.tr --uniform --verbose

$ estimateVBExpression -o final_abun alignment_info.prob -t genome_info.tr


After the pipeline finishes successfully, there should be a file named 'final_abun.m_alphas', which gives the estimated abundance of each ST in
the metagenomic data.
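
The abundance values can be pulled out of the result file with a short awk filter. This is a sketch only, and it rests on an assumption about the file layout that should be checked against the BitSeq version in use: comment lines beginning with '#', then one row per sequence whose first column is the mean relative abundance, with the first data row possibly corresponding to BitSeq's noise term and not to an ST. list_abundances is a hypothetical helper name.

```shell
# Print the first column of each data row, numbered in file order.
# list_abundances is a hypothetical helper; the meaning of the column and of
# the first row (possibly a noise term) is an assumption to verify.
list_abundances() {
    awk '
        /^#/ || NF == 0 { next }    # skip comment and blank lines
        { n++; printf "%d\t%s\n", n, $1 }
    '
}

# Run only when the pipeline output is present.
if [ -f final_abun.m_alphas ]; then
    list_abundances < final_abun.m_alphas
fi
```

Row numbers follow the order of the file, so they need to be matched to ST names by hand, for example against the sequence order in core_alignment_gapless.fasta.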